Large Deviations for Processes with Independent Increments
James  Lynch  and  Jayaram  Sethuraman 

Abstract 

Let $X$ be a topological space and $F$ denote the Borel $\sigma$-field in $X$. A family of probability measures $\{P_\lambda\}$ is said to obey the large deviation principle (LDP) with rate function $I(\cdot)$ if $P_\lambda(A)$ can be suitably approximated by $\exp\{-\lambda \inf_{x \in A} I(x)\}$ for appropriate sets $A$ in $F$.

Here the LDP is studied for probability measures induced by stochastic processes with stationary and independent increments which have no Gaussian component. It is assumed that the moment generating function of the increments exists and thus the sample paths of such stochastic processes lie in the space of functions of bounded variation. The LDP for such processes is obtained under the weak*-topology. This covers a case which was ruled out in the earlier work of Varadhan (1966). As applications, the large deviation principle for the Poisson, Gamma and Dirichlet processes is obtained.


1. Introduction.

Let $X$ be a topological space and $F$ denote the Borel $\sigma$-field in $X$. Let $\{P_\lambda\}$ be a family of probability measures on $(X,F)$. The family $\{P_\lambda\}$ is said to obey the large deviation principle (LDP) (for a more precise definition see Section 2) with rate function $I(\cdot)$ if $P_\lambda(A)$ can be approximated by $\exp\{-\lambda \inf_{x \in A} I(x)\}$ for appropriate subsets $A$ in $F$.

Important examples of the LDP include the cases where $P_\lambda$ ($\lambda$ a positive integer) is either (i) the probability measure induced by the average of $\lambda$ i.i.d. random variables (see Chernoff, 1952; Bahadur and Zabell, 1979; Varadhan, 1983) or (ii) the probability measure of the empirical distribution of $\lambda$ i.i.d. random variables (Groeneboom, Oosterhoff and Ruymgaart, 1979). In an important paper, Ellis (1984) has elegantly shown how to establish the LDP when $X = R^k$, solely in terms of the moment generating functions of $P_\lambda$. Further examples may be found in the recent surveys on large deviations by Azencott (1980) and by Varadhan (1983).



The establishment of the LDP has had important implications in various areas in statistics. It has been used to obtain the asymptotic efficiencies of tests and estimates (Chernoff, 1952; Bahadur, 1960a, b, 1967 and 1971) and to obtain the asymptotic behavior of functional integrals associated with solutions of stochastic integrals (Varadhan, 1966 and 1983). It appears in the evaluation of the 'free' energy in statistical mechanics (Lanford, 1973; Ruelle, 1969). It is also intimately related to certain types of laws of large numbers (Shepp, 1964; Erdős and Rényi, 1970).
In this paper we study the LDP for a stochastic process with stationary independent increments with no Gaussian component and obtain complete results. The space $X$ that is appropriate here is $BV[0,1]$, the space of functions of bounded variation, and the topology is that of weak*-convergence. Varadhan (1966) studied the LDP for similar processes with possible Gaussian components but satisfying the condition $J(a)/|a| \to \infty$ as $|a| \to \infty$, where $J(\cdot)$ is as defined in (3.2) and is the rate function based on the distribution of the increments. Varadhan (1966) used the space $D[0,1]$ and the topology of uniform convergence. However, the condition $J(a)/|a| \to \infty$ is violated for many processes of interest including the Gamma process. We illustrate our LDP results for this process and a related process called the Dirichlet process.

The  organization  of  this  paper  is  as  follows:  Preliminary 
definitions  and  general  results  on  the  LDP,  which  are  used  in  later 
sections,  are  given  in  Section  2.  A  rate  function  on  M[0,1],  the  space  of 
finite  measures  on  [0,1],  is  defined  and  several  theorems  concerning  this 
rate  function  are  proved  in  Section  3.  In  Section  4,  the  LDP  is 
established  for  stochastic  processes  with  stationary  and  positive 
independent  increments  which  are  considered  as  elements  of  M[0,1].  In 
Section  5,  the  general  LDP  results  are  given  for  stochastic  processes  with 
stationary  independent  increments  and  no  Gaussian  component  which  are 
considered  as  elements  of  BV[0,1],  the  space  of  functions  of  bounded 
variation. The final section, Section 6, is devoted to applications to the
Poisson, Gamma and Dirichlet processes.


2. Definitions and General Results.


Let $X$ be a topological space and $F$ be the Borel $\sigma$-field in $X$. Let $\{P_\lambda\}$ be a family of probability measures on $(X,F)$. The following definitions, which are slight variants of those of Varadhan (1983), allow us to state many large deviation results in concise form.


Definition 2.1. A function $I(\cdot)$ on $X$ is said to be a regular rate function if

(2.1) $0 \le I(x) \le \infty$,

(2.2) $I(\cdot)$ is lower semi-continuous (lsc), and

(2.3) for each $c < \infty$, $\Gamma_c = \{x : I(x) \le c\}$ is compact.

For any subset $A$ of $X$, define

(2.4) $I(A) = \inf_{x \in A} I(x)$.

Definition 2.2. The measures $\{P_\lambda\}$ satisfy the large deviation principle (LDP or LD principle) with rate function $I(\cdot)$ if

(2.5) $I(\cdot)$ is a regular rate function,

(2.6) for each closed set $F$, $\limsup \frac{1}{\lambda} \log P_\lambda(F) \le -I(F)$, and

(2.7) for each open set $G$, $\liminf \frac{1}{\lambda} \log P_\lambda(G) \ge -I(G)$,

where here and throughout the remainder of this paper the limits are as $\lambda \to \infty$.

Definition 2.3. The measures $\{P_\lambda\}$ satisfy the weak large deviation principle (WLDP or the weak LD principle) with rate function $I(\cdot)$ if (2.5) and (2.7) of Definition 2.2 together with (2.8) below are satisfied:

(2.8) for each compact set $K$, $\limsup \frac{1}{\lambda} \log P_\lambda(K) \le -I(K)$.

Definition 2.4. The measures $\{P_\lambda\}$ are large deviation tight (LD tight) if, for each $M < \infty$, there exists a compact set $K_M$ such that

(2.9) $\limsup \frac{1}{\lambda} \log P_\lambda(K_M^c) \le -M$.

The  following  lemma  shows  the  usefulness  of  LD  tightness. 

Lemma 2.5. Let $\{P_\lambda\}$ be LD tight and satisfy the WLDP. Then it satisfies the LDP.

Proof. Let $C$ be closed and let $\ell < I(C)$. Let $M > \ell$ and choose a compact set $K_M$ to satisfy (2.9). Then $C \cap K_M$ is compact and

$P_\lambda(C) \le P_\lambda(C \cap K_M) + P_\lambda(K_M^c)$.

Thus,

$\limsup \frac{1}{\lambda} \log P_\lambda(C) \le -\min\{I(C \cap K_M), M\} \le -\ell$. □

Many interesting applications in large deviations occur when $X$ is a Polish space, that is, a separable complete metric space. Accordingly, we will assume that all spaces we consider in the rest of this paper are Polish spaces, and that the corresponding $\sigma$-fields are Borel $\sigma$-fields.




For  sequences  of  probability  measures  on  a  Polish  space  the 
following  lemma,  which  will  not  be  referred  to  in  the  remainder  of  the 
paper,  shows  that  the  LDP  implies  LD  tightness.  Consequently,  the  LDP  is 
equivalent  to  the  WLDP  and  LD  tightness  along  subsequences. 


Lemma 2.6. If $\{P_\lambda\}$ is a sequence of probability measures which satisfies the LDP, then $\{P_\lambda\}$ is LD tight.

Proof. Let $\{x_i, i = 1,2,\dots\}$ be a countable dense set in $X$. For any $\delta > 0$, let $A_i(\delta)$ be the open sphere of radius $\delta$ around $x_i$. Then $\bigcup_i A_i(1/k) = X$ for $k = 1,2,\dots$ Fix $M > 0$ and an integer $k$. Consider the compact set $\Gamma_{2kM} = \{x : I(x) \le 2kM\}$. There exists a finite open covering

$A(k) = \bigcup_{i=1}^{I_k} A_i(1/k)$

of $\Gamma_{2kM}$. Thus, from (2.6),

$\limsup \frac{1}{\lambda} \log P_\lambda(A^c(k)) \le -I(A^c(k)) \le -I(\Gamma_{2kM}^c) \le -2kM$.

Since we are considering only sequences $\{\lambda\}$ we can find a larger finite union

$B(k) = \bigcup_{i=1}^{J_k} A_i(1/k)$

with $J_k \ge I_k$ such that

$P_\lambda(B^c(k)) \le e^{-\lambda k M}$

for all $\lambda$. The set $K = \bigcap_{k=1}^{\infty} \bar{B}(k)$, where $\bar{B}(k)$ is the closure of $B(k)$, is totally bounded and closed, and hence is compact. Furthermore

(2.10) $P_\lambda(K^c) \le \sum_{k=1}^{\infty} P_\lambda(B^c(k)) \le e^{-\lambda M}/(1 - e^{-\lambda_0 M})$

for all $\lambda$, where $\lambda_0$ is the smallest index in the sequence $\{\lambda\}$. This completes the proof of Lemma 2.6. □

Let $\{P_\lambda^i\}$ be a family of probability measures on a Polish space $X^i$, $i = 1,2$. Let $P_\lambda = P_\lambda^1 \times P_\lambda^2$ be the product measure on the product space $X = X^1 \times X^2$. We will now investigate whether LD properties of marginal measures carry over to the product measures.

Lemma 2.7. If $\{P_\lambda^i\}$ is LD tight for $i = 1,2$, then $\{P_\lambda\}$ is LD tight.

Proof. Obvious.

Lemma 2.8. Let $\{P_\lambda^i\}$ satisfy the WLDP with rate function $I^i(x_i)$, $i = 1,2$. Then $\{P_\lambda\}$ satisfies the WLDP with rate function $I(x_1,x_2) = I^1(x_1) + I^2(x_2)$.

Proof. It is easy to check the regularity of $I(x_1,x_2)$ from the regularity of $I^1(x_1)$ and $I^2(x_2)$. Let $K \subset X$ be compact and let $\ell < I(K)$. For each $(x_1,x_2) \in K$, since $I(\cdot)$ is lsc, there are open sets $O^i_{x_i}$ in $X^i$ containing $x_i$, $i = 1,2$, such that

(2.11) $\inf\{I(y_1,y_2) : (y_1,y_2) \in O^1_{x_1} \times O^2_{x_2}\} > \ell$.

Furthermore, since $X^i$ is Polish, we can find open subsets $N^i_{x_i}$ of $O^i_{x_i}$ such that $x_i \in N^i_{x_i}$ and $\bar{N}^i_{x_i} \subset O^i_{x_i}$. Consider the open covering $\bigcup_{(x_1,x_2) \in K} N^1_{x_1} \times N^2_{x_2}$ of $K$. We can extract a finite subcovering $\bigcup_{m=1}^{M} N^1_{x_{1,m}} \times N^2_{x_{2,m}}$ of $K$. Let $K^1$ and $K^2$ be the projections of $K$ in $X^1$ and $X^2$. Then $K^i$ and $M^i_{x_{i,m}} = \bar{N}^i_{x_{i,m}} \cap K^i$ are compact, $m = 1,\dots,M$ and $i = 1,2$. Furthermore, $K \subset \bigcup_{m} M^1_{x_{1,m}} \times M^2_{x_{2,m}}$. Thus, since $M^i_{x_{i,m}}$ is compact and $\{P_\lambda^i\}$ satisfies the WLDP,

$\limsup \frac{1}{\lambda} \log P_\lambda(K) \le -\min_m \left( I^1(M^1_{x_{1,m}}) + I^2(M^2_{x_{2,m}}) \right) \le -\ell$

in view of (2.11). This proves (2.8).

Let $O$ be an open set in $X$. Fix $\varepsilon > 0$ and choose $(x_1,x_2) \in O$ so that $I(x_1,x_2) < I(O) + \varepsilon$. There exist open sets $O_{x_i}$ in $X^i$ around $x_i$, $i = 1,2$, such that $O_{x_1} \times O_{x_2} \subset O$. Thus

$\liminf \frac{1}{\lambda} \log P_\lambda(O) \ge \sum_{i=1}^{2} \liminf \frac{1}{\lambda} \log P_\lambda^i(O_{x_i}) \ge -I(x_1,x_2) > -I(O) - \varepsilon$.

Since $\varepsilon > 0$ is arbitrary, this establishes (2.7), which completes the proof of Lemma 2.8. □
The following corollary follows from Lemmas 2.5, 2.7 and 2.8.

Corollary 2.9. Let $\{P_\lambda^i\}$ be LD tight and satisfy the WLDP, $i = 1,2$. Then $P_\lambda = P_\lambda^1 \times P_\lambda^2$ satisfies the LDP.

Two  important  and  immediate  derivatives  of  the  LDP  are  the 
contraction  principle,  which  is  used  later  in  this  paper,  and  the 
asymptotic  expression  for  certain  integrals.  These  are  stated  below.  For 
proofs see Varadhan (1966, 1983).

Let $\{P_\lambda\}$ satisfy the LDP with rate function $I(x)$. Let $h$ be a continuous map from $X$ into another Polish space $Y$, and let $Q_\lambda = P_\lambda h^{-1}$.

Contraction Principle. The measures $\{Q_\lambda\}$ satisfy the LDP with rate function

(2.12) $K(y) = \inf_{x : h(x) = y} I(x)$.

Asymptotic expression for certain integrals. Let $F$ be a bounded real valued continuous function on $X$. Then

(2.13) $\frac{1}{\lambda} \log \int \exp(\lambda F(x))\, dP_\lambda(x) \to \sup_x [F(x) - I(x)]$.

It  is  interesting  to  note  the  definition  of  the  LDP  and  LD  tightness 
together  with  their  consequences,  namely  (2.12)  and  (2.13)  above,  run 
parallel  to  the  definition  of  weak  convergence  and  tightness  (see 
Billingsley  1968)  together  with  their  consequences,  namely  the  continuous 
mapping  principle  and  convergence  of  integrals  of  bounded  continuous 


functions. 


Let $X$ be a real valued random variable and let

(3.1) $\phi(\theta) = E(e^{\theta X}) < \infty$

for $|\theta| < \eta$ where $\eta > 0$. Let $\psi(\theta) = \log \phi(\theta)$. Define

(3.2) $J(a) = J_X(a) = \sup_t [at - \psi(t)]$.

The function $J(a)$ is loosely called the rate function associated with $X$. More precisely, let $P_n$ be the distribution of $(X_1 + \dots + X_n)/n$ where $X_1, X_2, \dots$ are i.i.d. copies of $X$. The following is the oldest theorem in large deviation theory and is variously referred to as Cramér's theorem and Chernoff's theorem.

Theorem 3.1. (Cramér, 1938; Chernoff, 1952). The distributions $\{P_n\}$ are LD tight and satisfy the LDP with rate function $J(a)$.

The following facts concerning the function $J(a)$ are easy to obtain from its definition in (3.2):

(3.3) $0 \le J(a) \le \infty$, $J(\mu) = 0$ where $E(X) = \mu$, and $J(a) \to \infty$ as $|a| \to \infty$.

(3.4) $J(a) = \sup_{t \ge 0} [at - \psi(t)]$ if $a \ge \mu$.

(3.5) $J(a)$ is convex.

(3.6) $\lim_{a \to \infty} J(a)/a = C_1$ and $\lim_{a \to -\infty} J(a)/|a| = C_2$ exist, where $0 < C_1, C_2 \le \infty$.

(3.7) The function $g(b)$ defined by $g(b) = bJ(1/b)$ if $0 < b < \infty$ and $g(b) = C_1$ if $b = 0$ is convex on $[0,\infty)$.

(3.8) If the support of $X$ is $[0,\infty)$, then $J(0) < \infty$ if and only if $P(X = 0) > 0$.
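The definition (3.2) and the facts above can be checked numerically on simple laws. The sketch below is only an illustration: the helper names (`legendre`, `J_exp`, `J_poisson`), the search intervals, and the two test laws (an Exponential law with mean 1 and a Poisson law with parameter 2) are assumptions made here, not part of the paper. It approximates the supremum in (3.2) by a ternary search, which is valid because $t \mapsto at - \psi(t)$ is concave.

```python
import math

def legendre(psi, a, lo, hi, iters=200):
    """Approximate sup over t in [lo, hi] of a*t - psi(t) by ternary search.

    psi is assumed convex (true for a cumulant generating function),
    so the objective is concave on the interval.
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if a * m1 - psi(m1) < a * m2 - psi(m2):
            lo = m1
        else:
            hi = m2
    t = 0.5 * (lo + hi)
    return a * t - psi(t)

def psi_exp(t):
    """Cumulant generating function of Exponential(1); finite for t < 1."""
    return -math.log(1.0 - t)

def psi_poisson(t, mu=2.0):
    """Cumulant generating function of Poisson(mu)."""
    return mu * (math.exp(t) - 1.0)

def J_exp(a):
    # Search interval is an illustrative choice that stays inside t < 1.
    return legendre(psi_exp, a, -50.0, 1.0 - 1e-12)

def J_poisson(a, mu=2.0):
    return legendre(lambda t: psi_poisson(t, mu), a, -50.0, 50.0)

if __name__ == "__main__":
    for a in (0.5, 1.0, 3.0):
        print(a, J_exp(a), a - 1.0 - math.log(a))
```

For these two laws the search reproduces the closed forms $a - 1 - \log a$ and $a \log(a/\mu) - a + \mu$ that appear later in the applications, and it vanishes at the mean, as (3.3) states.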


We will now obtain an illustration of the contraction principle which will be used in Section 5 to identify the LD rates. Let $X = X^{(1)} - X^{(2)}$ where $X^{(1)}$ and $X^{(2)}$ are independent non-negative random variables. Under assumption (3.1), the moment generating functions $\phi^{(i)}(\theta)$ of $X^{(i)}$ exist in a neighborhood of 0, $i = 1,2$. Let $\psi^{(i)}(\theta) = \log \phi^{(i)}(\theta)$ and define the rate function $J^{(i)}(a)$ of $X^{(i)}$ analogously to (3.2), $i = 1,2$. From Theorem 3.1 and Corollary 2.9, the distributions of the arithmetic means of i.i.d. copies of the bivariate random variable $(X^{(1)}, X^{(2)})$ satisfy the LD principle with rate function $J^{(1)}(x_1) + J^{(2)}(x_2)$. From the contraction principle we obtain the useful result

(3.9) $J(a) = \inf_b \left( J^{(1)}(a+b) + J^{(2)}(b) \right)$.

Let

(3.10) $C^{(i)} = \lim_{a \to \infty} J^{(i)}(a)/a$, $i = 1,2$.

We will now show that

(3.11) $C_i = C^{(i)}$, $i = 1,2$,

where $C_1$, $C_2$ are as defined in terms of $J(a)$ in (3.6). Note that $\psi(\theta) = \psi^{(1)}(\theta) + \psi^{(2)}(-\theta)$ and that $\psi^{(2)}(-\theta) \le 0$ for $\theta \ge 0$ since $X^{(2)}$ is non-negative. Thus

(3.12) $a\theta - \psi^{(1)}(\theta) \le J(a) \le J^{(1)}(a+b) + J^{(2)}(b)$

for all $\theta \ge 0$ and all $b$. From (3.4),

$J^{(1)}(a) = \sup_{\theta \ge 0} [a\theta - \psi^{(1)}(\theta)]$

for large $a$. Dividing (3.12) by $a$ and allowing $a$ to tend to $\infty$, we obtain $C_1 = C^{(1)}$. Similarly $C_2 = C^{(2)}$.
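The identity (3.9) can be sanity-checked on a concrete pair of laws. The sketch below assumes, purely for illustration, that $X^{(1)}$ and $X^{(2)}$ are Poisson with parameters `mu1` and `mu2`; the function names and the search width are hypothetical choices. It compares the Legendre transform of $\psi(\theta) = \mu_1(e^{\theta}-1) + \mu_2(e^{-\theta}-1)$, computed from its stationary point, with the infimum in (3.9), computed by a ternary search over $b$ (the objective is convex in $b$).

```python
import math

def J_poisson(x, mu):
    """Rate function of a Poisson(mu) variable; +inf on the negative half-line."""
    if x < 0.0:
        return math.inf
    if x == 0.0:
        return mu
    return x * math.log(x / mu) - x + mu

def J_difference(a, mu1, mu2):
    """sup_t [a*t - mu1*(e^t - 1) - mu2*(e^-t - 1)], via the stationary point."""
    e_t = (a + math.sqrt(a * a + 4.0 * mu1 * mu2)) / (2.0 * mu1)
    t = math.log(e_t)
    return a * t - mu1 * (e_t - 1.0) - mu2 * (1.0 / e_t - 1.0)

def J_inf_convolution(a, mu1, mu2, width=60.0, iters=300):
    """inf over b of J1(a + b) + J2(b), restricted to b >= max(0, -a)."""
    lo = max(0.0, -a)
    hi = lo + width  # illustrative upper limit; the minimiser is far below it here

    def cost(b):
        return J_poisson(a + b, mu1) + J_poisson(b, mu2)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if cost(m1) > cost(m2):
            lo = m1
        else:
            hi = m2
    return cost(0.5 * (lo + hi))

if __name__ == "__main__":
    for a in (-2.0, 0.5, 1.0, 3.0):
        print(a, J_difference(a, 2.0, 1.0), J_inf_convolution(a, 2.0, 1.0))
```

With `mu1 = 2` and `mu2 = 1` both computations vanish at $a = 1$, the mean of the difference, and agree elsewhere up to numerical tolerance.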

Let $M[0,1]$ be the space of finite measures on $([0,1], B)$ where $B$ is the usual Borel $\sigma$-field in $[0,1]$. For any element $f$ in $M[0,1]$, we define its distribution function $f(t)$ by letting $f(0) = 0$, $f(t) = f([0,t])$, $0 < t \le 1$. We also use the same symbol $f$ to denote both the measure $f(A)$ and the (extended) distribution function $f(t)$.

Let $\alpha$ be a probability measure on $[0,1]$, that is, $\alpha \in M[0,1]$ and $\alpha(1) = 1$. Let $0 = t_0 < t_1 < \dots < t_k = 1$. Both the collection of points $\{t_0, t_1, \dots, t_k\}$ and the collection of intervals $\{[0,t_1], (t_1,t_2], \dots, (t_{k-1},1]\}$ will be referred to as the partition $P$. Let $\sigma(P)$ be the $\sigma$-field generated by the intervals in the partition $P$. The partitions $\{P\}$ form a directed set under the partial order $P' > P$ if $\sigma(P') \supset \sigma(P)$. We will be taking limits of functions on $\{P\}$ and it will always be along directed nets such that $\sigma(P) \uparrow B$.

Let $f \in M[0,1]$ and $P$ be a partition. We define

(3.13) $I_P(f) = \sum_{i=1}^{k} J\left( \frac{f(t_i) - f(t_{i-1})}{\alpha(t_i) - \alpha(t_{i-1})} \right) (\alpha(t_i) - \alpha(t_{i-1}))$ if $\alpha(t_i) - \alpha(t_{i-1}) = 0$ implies $f(t_i) - f(t_{i-1}) = 0$, $i = 1,\dots,k$, and $I_P(f) = \infty$ otherwise,

wherein $J(a)$ is the rate function of some non-negative random variable $X$ satisfying (3.1) and we observe the convention $0 \cdot (\text{undefined}) = 0$ and $0 \cdot \infty = 0$. The rest of this section is devoted to obtaining many important properties of $I_P(f)$ which are useful in obtaining the LDP results of Section 4.
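The partition functional in (3.13) is a finite sum and is easy to compute. The sketch below is a minimal illustration in which the function names, the choice of the rate function $J(a) = a - 1 - \log a$, and the example measures are all assumptions made here. It takes the increments of $f$ and $\alpha$ over the cells of a partition and applies the stated conventions for cells of zero $\alpha$-mass.

```python
import math

def J_gamma(a):
    """Illustrative rate function a - 1 - log a, with J(0) = +inf."""
    if a <= 0.0:
        return math.inf
    return a - 1.0 - math.log(a)

def I_partition(f_incr, alpha_incr, J):
    """Evaluate the sum in (3.13) from cell increments of f and alpha.

    Returns +inf if some cell has alpha-mass 0 but positive f-mass;
    cells with alpha-mass 0 and f-mass 0 contribute 0 by convention.
    """
    total = 0.0
    for df, da in zip(f_incr, alpha_incr):
        if da == 0.0:
            if df != 0.0:
                return math.inf
            continue
        term = J(df / da)
        if math.isinf(term):
            return math.inf
        total += term * da
    return total

if __name__ == "__main__":
    # alpha uniform on [0,1]; f has density 2 on [0,1/2] and 1/2 on (1/2,1].
    coarse = I_partition([1.25], [1.0], J_gamma)
    fine = I_partition([1.0, 0.25], [0.5, 0.5], J_gamma)
    print(coarse, fine)
```

In this example the value on the coarse partition $\{0,1\}$ is smaller than the value on the refinement $\{0,1/2,1\}$, as convexity of $J$ and Jensen's inequality suggest.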

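Denote the restriction of the measures $\alpha$ and $f$ to $\sigma(P)$ by $\alpha_P$ and $f_P$, respectively. We may rewrite the definition in (3.13) as

(3.14) $I_P(f) = \int J\left( \frac{df_P}{d\alpha_P} \right) d\alpha_P$ if $f_P \ll \alpha_P$, and $I_P(f) = \infty$ otherwise.

Let $f = f_1 + f_2$ be the Lebesgue decomposition of $f$ with respect to $\alpha$, with $f_1 \ll \alpha$ and $f_2 \perp \alpha$. Let $L \subset [0,1]$ be such that $f_2(L) = f_2([0,1])$ and $\alpha(L) = 0$. Similarly define $\alpha_1$, $\alpha_2$ and $M$ by $\alpha = \alpha_1 + \alpha_2$, $\alpha_1 \ll f$, $\alpha_2 \perp f$, $\alpha_2(M) = \alpha_2([0,1])$ and $f(M) = 0$. Let $f_1' = df_1/d\alpha$ and $\alpha_1' = d\alpha_1/df$. Then $f_1' = 1/\alpha_1' > 0$ a.e. on $(L \cup M)^c$ with respect to $f$ and $\alpha$.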



(3.18) 


(3.22) $I(f) = \int g(\alpha_1')\, df + J(0)\, \alpha_2([0,1])$.


It is possible that $C_1 = \infty$ or $J(0) = \infty$ or both and so we consider the following cases to complete the proof:

(i) $I(f) = \infty$,

(ii) $I(f) < \infty$ and $f_2([0,1]) = 0$,

(iii) $I(f) < \infty$ and $\alpha_2([0,1]) = 0$,

(iv) $I(f) < \infty$, $f_2([0,1]) > 0$ and $\alpha_2([0,1]) > 0$.

Case (i). In this case from (3.15), $\int J(f_1')\, d\alpha = \infty$ or $C_1 \cdot f_2([0,1]) = \infty$ or both. When $\int J(f_1')\, d\alpha = \infty$, (3.17) and Fatou's lemma imply that $I_P(f) \to \infty$. When $C_1 \cdot f_2([0,1]) = \infty$, we have $\int g(\alpha_1')\, df = \infty$. From (3.18), (3.20) and Fatou's lemma, we once again obtain $I_P(f) \to \infty$.

Case (ii). In this case $f \ll \alpha$ and we can adjoin the limit $(f_1', B)$ to the martingale $\{df_P/d\alpha_P, \sigma(P)\}$. The function $J$ is convex and from (3.15), $J(f_1')$ is $\alpha$-integrable. This implies that $J(df_P/d\alpha_P)$ is uniformly integrable. It therefore follows that $I_P(f) \to I(f)$.

Case (iii). In this case $\alpha \ll f$ and $\{d\alpha_P/df_P, \sigma(P)\}$ is a martingale under $f$ to which can be adjoined its limit $\{\alpha_1', B\}$. The function $g$ is convex and from (3.22), $g(\alpha_1')$ is $f$-integrable. This implies that $g(d\alpha_P/df_P)$ is uniformly integrable. Again, it follows that $I_P(f) \to I(f)$.

Case (iv). In this case $J(0) < \infty$ and $C_1 < \infty$, hence the functions $J$ and $g$ are bounded on $[0,\mu]$ and $[0,\mu^{-1}]$, respectively. Using the definitions (3.19) and (3.21) and the bounded convergence theorem, we have $I_P(f) \to I(f)$.

Remark. In Theorem 3.2 we have actually shown that

(3.23) $\sup_P I_P(f) = I(f)$.

The next two lemmas establish the fact that $I(f)$ is a regular rate function with respect to the weak*-topology on $M[0,1]$. A sequence $f_n$ in $M[0,1]$ converges in the weak*-sense to $f$ if $f_n(t) \to f(t)$ for each $t$ at which $f$ is continuous. Following tradition, we will refer to the weak*-topology as the weak topology in the rest of this paper.

Lemma 3.3. The function $I(f)$ is lsc in the weak topology.


Proof. Fix $f \in M[0,1]$. Let $f_n \to f$ weakly. We need to show that

(3.24) $\liminf_n I(f_n) \ge I(f)$.

If the support of $f$ is not contained in the support of $\alpha$, then $I(f) = \infty$ and there exists a weak open neighborhood $G$ of $f$ containing only measures whose supports are not included in the support of $\alpha$. Then $f_n \in G$ for all large $n$ and thus $\lim I(f_n) = \infty$, which establishes (3.24).

If the support of $f$ is contained in the support of $\alpha$, choose a partition $P = \{0 = t_0, t_1, \dots, t_k = 1\}$ consisting of continuity points of $f$. Then $f_n(t_i) \to f(t_i)$ for each $i$, and thus $\lim I_P(f_n) = I_P(f)$. From (3.23), $I_P(f_n) \le I(f_n)$. Thus $\liminf_n I(f_n) \ge I_P(f)$. By allowing $\sigma(P)$ to tend to $B$ along such partitions and using Theorem 3.2, we obtain (3.24). □


Lemma 3.4. Let $c < \infty$. The set

(3.25) $\Gamma_c = \{f : I(f) \le c\}$

is compact.

Proof. Consider the partition $P = \{0,1\}$. We have

$J(f([0,1])) = I_P(f) \le I(f) \le c$

for $f \in \Gamma_c$. Since $J(a) \to \infty$ as $a \to \infty$, we can find $d < \infty$ such that $\Gamma_c \subset A_d$ where $A_d = \{f : f([0,1]) \le d\}$. The set $A_d$ is weakly compact and from Lemma 3.3 the set $\Gamma_c$ is weakly closed. Hence $\Gamma_c$ is weakly compact. □

The following minimax theorem is the driving force behind the upper bound of the LD results of the next section.

Theorem 3.5. Let $F$ be a weakly closed subset of $M[0,1]$. Then

(3.26) $\sup_P I_P(F) = I(F)$,

where for any set $A$

$I_P(A) = \inf_{f \in A} I_P(f)$ and $I(A) = \inf_{f \in A} I(f)$.

Proof. From (3.23) we immediately have

$\sup_P I_P(F) \le I(F)$.

Suppose that (3.26) were not true; then there exists an $\eta < \infty$ such that

(3.27) $\sup_P I_P(F) < \eta < I(F)$.
P 
Thus, for each partition $P = \{0 = t_0 < t_1 < \dots < t_k = 1\}$, we can find $f_P$ in $F \subset M[0,1]$ such that $I_P(f_P) < \eta$. The support of such an $f_P$ will be contained in the support of $\alpha$. Let $\hat{f}_P$, called the $P$-linear form of $f_P$ with respect to $\alpha$, be defined by

$\hat{f}_P(A) = \sum_{i=1}^{k} \frac{f_P(t_i) - f_P(t_{i-1})}{\alpha(t_i) - \alpha(t_{i-1})}\, \alpha(A \cap (t_{i-1}, t_i])$,

with the first interval read as $[0,t_1]$. Then $\hat{f}_P(t_i) = f_P(t_i)$, $0 \le i \le k$, and

$I_P(f_P) = I_P(\hat{f}_P) = I(\hat{f}_P)$.

Hence $\{\hat{f}_P\}$ is a net in the set $\Gamma_\eta$, which is compact from Lemma 3.4. Thus, there is a cluster point $f_0$ of this net and $I(f_0) \le \eta$ from the lower semicontinuity of $I$. If we can show that $f_0$ is a cluster point of $\{f_P\}$, it will follow that $f_0$ belongs to $F$ since $F$ is closed. Since $I(f_0) \le \eta$, this will lead to a contradiction of (3.27), and the conclusion (3.26) would have been established.

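Let $P' = \{0 = t'_0, t'_1, \dots, t'_k = 1\}$ be a partition consisting of continuity points of $f_0$. Fix $\varepsilon > 0$, and let $N_{P',\varepsilon}$ be a weak neighborhood of $f_0$ defined by

$N_{P',\varepsilon} = \{g : \max_i |g(t'_i) - f_0(t'_i)| < \varepsilon\}$.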


Since $f_0$ is a cluster point of $\{\hat{f}_P\}$, there is a partition $P'' > P'$ such that $\hat{f}_{P''} \in N_{P',\varepsilon}$. Since $f_P$ and $\hat{f}_P$ agree on the partition $P$, it follows that $f_{P''} \in N_{P',\varepsilon}$ and that $f_0$ is a cluster point of $\{f_P\}$. This completes the proof of Theorem 3.5.


Theorems 3.2, 3.5 and Lemmas 3.3, 3.4 dealt with the rate function $I(f)$ which involved the function $J$. It was assumed that $J$ was the rate function of a non-negative random variable $X$ satisfying (3.1). When these results are applied in Sections 4 and 5 we will restrict $X$ to be non-negative and infinitely divisible. For this special case the following facts are noted concerning the finiteness of $J(0)$ and $C_1$. From (3.8), $J(0)$ is finite if and only if $P(X = 0) > 0$. Thus $J(0) = \infty$ for the Gamma distribution and $J(0) = \mu$ for the Poisson distribution with parameter $\mu$. On the other hand, $C_1 = \infty$ for the Poisson distribution and $C_1 = 1$ for the Gamma distribution with shape parameter 1.

The results of the rest of this paper would be strengthened if we could have proved Lemmas 3.3, 3.4 and Theorem 3.5 in the Skorohod topology wherein the distribution functions $f$ are considered as elements of $D[0,1]$. Unfortunately certain complications occur as indicated by the following remark.

The Skorohod topology is stronger than the weak topology. Thus the rate function $I(f)$ is Skorohod lsc, and hence $\Gamma_c$ is Skorohod closed. However $\Gamma_c$ is not Skorohod compact as the following example demonstrates.


Let

$f_n(t) = t$ for $0 \le t \le \frac{1}{2} - \frac{1}{n}$,
$f_n(t) = t + n\left(t - \frac{1}{2} + \frac{1}{n}\right)$ for $\frac{1}{2} - \frac{1}{n} < t \le \frac{1}{2}$,
$f_n(t) = t + 1$ for $\frac{1}{2} < t \le 1$.

Let $J(a) = a - 1 - \log a$, which is the rate function corresponding to the Gamma distribution with shape parameter 1. Let $\alpha$ be the Lebesgue measure. Then

$I(f_n) = 1 - \frac{1}{n} \log(1+n)$

and $f_n \in \Gamma_1$. Note that $f_n \to f$ in the weak topology, where

$f(t) = t$ for $t < \frac{1}{2}$, and $f(t) = t + 1$ for $\frac{1}{2} \le t \le 1$.

Since $f_n$ is continuous and $f$ has a jump at $t = \frac{1}{2}$, no subsequence of $f_n$ can converge in the Skorohod topology.
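The value of the rate function in this example can be verified directly, since each $f_n$ is piecewise linear and $\alpha$ is the Lebesgue measure. The sketch below is an illustration only; the function names and the reading of the middle branch as $t + n(t - 1/2 + 1/n)$ are assumptions made here. It integrates $J(f_n')$ piece by piece and compares the result with $1 - \frac{1}{n}\log(1+n)$.

```python
import math

def J(a):
    """Rate function a - 1 - log a used in the example (a > 0)."""
    return a - 1.0 - math.log(a)

def f_n(t, n):
    """Piecewise-linear distribution function of the example, for n >= 2."""
    left = 0.5 - 1.0 / n
    if t <= left:
        return t
    if t <= 0.5:
        return t + n * (t - left)
    return t + 1.0

def rate_of_f_n(n):
    """Integral of J(f_n') with respect to Lebesgue measure on [0, 1]."""
    # (slope, length) of each linear piece of f_n.
    pieces = [(1.0, 0.5 - 1.0 / n), (1.0 + n, 1.0 / n), (1.0, 0.5)]
    return sum(J(slope) * length for slope, length in pieces)

if __name__ == "__main__":
    for n in (2, 10, 1000):
        print(n, rate_of_f_n(n), 1.0 - math.log(1.0 + n) / n)
```

Every computed value stays below 1, so the whole sequence lies in one level set, while the steep middle piece of $f_n$ has slope $1+n$ and approaches the jump of $f$.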


4.  LD  Rates  for  Stochastic  Processes  with  Stationary 
and  Non-negative  Independent  Increments. 




Let $\{X(t), 0 \le t \le 1\}$ be a stochastic process with stationary and non-negative independent increments and measurable sample paths with $X(0) = 0$. Since the increments are non-negative, the sample paths of $\{X(t), 0 \le t \le 1\}$ can be considered as members of $M[0,1]$. Note that $X(1)$ is a non-negative infinitely divisible random variable.

We will assume that

(4.1) $\phi(\theta) = E(e^{\theta X(1)}) < \infty$

for some $\theta > 0$. Let $\psi(\theta) = \log \phi(\theta)$ and let $J$ be the rate function of $X(1)$ as defined in (3.2). Let $\alpha$ be a probability measure on $[0,1]$. Let the rate function $I(f)$ on $M[0,1]$ be as defined in (3.15). For $\lambda > 0$, define

(4.2) $Z_\lambda(t) = \frac{1}{\lambda} X(\lambda \alpha([0,t]))$, $0 \le t \le 1$.

Then $\{Z_\lambda(t), 0 \le t \le 1\}$ is a process with values in $M[0,1]$. Endow $M[0,1]$ with the weak topology and denote the induced distribution of $\{Z_\lambda(t), 0 \le t \le 1\}$ by $P_\lambda$. In this section we show that $\{P_\lambda\}$ is LD tight (Lemma 4.3) and satisfies the LDP with rate function $I(f)$ (Theorems 4.1 and 4.2).

Theorem 4.1. Let $F$ be a weakly closed subset of $M[0,1]$. Then

(4.3) $\limsup \frac{1}{\lambda} \log P_\lambda(F) \le -I(F)$.

Proof. Let $P = \{0 = t_0 < t_1 < \dots < t_k = 1\}$ be a partition and let

(4.4) $A_i = [0,t_1]$ if $i = 1$, and $A_i = (t_{i-1}, t_i]$ if $i = 2,\dots,k$,

(4.5) $W_{\lambda,i} = Z_\lambda(A_i)$, $1 \le i \le k$.

Then $\{W_{\lambda,i}, 1 \le i \le k\}$ are independent, and from Theorem 3.1 and Lemma 2.8 satisfy the LDP with rate function

(4.6) $\sum_{i=1}^{k} J\left( \frac{x_i}{\alpha(A_i)} \right) \alpha(A_i)$.

Now

$P_\lambda(F) = P(Z_\lambda \in F) \le P\{I_P(Z_\lambda) \ge I_P(F)\}$,

where

$I_P(Z_\lambda) = \sum_{i=1}^{k} J\left( \frac{W_{\lambda,i}}{\alpha(A_i)} \right) \alpha(A_i)$.

Since the support of $X(1)$ is $[0,\infty)$ the function $J(x)$ is continuous in $[0,\infty)$ and $J(x) \to \infty$ as $x \to \infty$. Thus the set

$\{(x_1,\dots,x_k) : \sum_{i=1}^{k} J\left( \frac{x_i}{\alpha(A_i)} \right) \alpha(A_i) \ge I_P(F)\}$

is closed in $R^k$. Using the LDP of $\{W_{\lambda,i}, 1 \le i \le k\}$ and its rate function in (4.6), we obtain

$\limsup \frac{1}{\lambda} \log P_\lambda(F) \le -I_P(F)$.

Since $P$ is arbitrary, we can use the minimax result in Theorem 3.5 to obtain

$\limsup \frac{1}{\lambda} \log P_\lambda(F) \le -I(F)$. □

Theorem 4.2. Let $G$ be a weakly open subset of $M[0,1]$. Then

(4.7) $\liminf \frac{1}{\lambda} \log P_\lambda(G) \ge -I(G)$.

Proof. There is nothing to prove if $I(G) = \infty$. Otherwise, fix $\varepsilon > 0$ and choose $f \in G$ so that $I(f) < I(G) + \varepsilon$. There is a $\delta > 0$ and a partition $P = \{0 = t_0 < t_1 < \dots < t_k = 1\}$ consisting of continuity points of $f$ and $\alpha$ such that the neighborhood

$N_{P,\delta} = \{g : \max_i |g(A_i) - f(A_i)| < \delta\}$

of $f$ is contained in $G$. Here $A_1,\dots,A_k$ are as defined in (4.4). Thus,

$P_\lambda(G) \ge P_\lambda\{\max_i |W_{\lambda,i} - f(A_i)| < \delta\}$,

where $\{W_{\lambda,i}\}$ are as defined in (4.5) and satisfy the LDP with the rate function in (4.6). Furthermore the set $G^* = \{(x_1,\dots,x_k) : \max_i |x_i - f(A_i)| < \delta\}$ is open in $R^k$. Thus

$\liminf \frac{1}{\lambda} \log P_\lambda(G) \ge -\inf \sum_{i=1}^{k} J\left( \frac{x_i}{\alpha(A_i)} \right) \alpha(A_i)$,

where the infimum is taken over the set $G^*$. Hence,

$\liminf \frac{1}{\lambda} \log P_\lambda(G) \ge -I_P(f) \ge -I(f) \ge -I(G) - \varepsilon$.

This completes the proof of Theorem 4.2. □

Lemma 4.3. The family of probability measures $\{P_\lambda\}$ is LD tight.

Proof. This follows from Lemma 2.6. A more direct proof is as follows. The sets

$K_L = \{f : f([0,1]) \le L\}$

are compact. Let $\theta > 0$ be such that $\phi(\theta) < \infty$. From the Markov inequality we have

$P_\lambda(K_L^c) \le \exp\{-\lambda[\theta L - \psi(\theta)]\}$,

so that $\frac{1}{\lambda} \log P_\lambda(K_L^c)$ can be made as small as we please by choosing $L$ sufficiently large. This completes the proof. □
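The Markov (Chernoff) bound used in this proof can be compared with an exact tail in a special case. The sketch below assumes, for illustration only, that $X$ is the Gamma process with shape parameter 1 and that $\lambda$ is a positive integer, so that $X(\lambda)$ has an Erlang law and $\psi(\theta) = -\log(1-\theta)$ for $\theta < 1$; the function names are hypothetical. It evaluates $\log P(X(\lambda) > \lambda L)$ exactly and checks it against $-\lambda[\theta L - \psi(\theta)]$.

```python
import math

def log_tail_gamma(n, x):
    """log P(Gamma(n, 1) > x) for integer n >= 1, from the Erlang sum
    exp(-x) * sum_{j < n} x^j / j!, evaluated in log space."""
    logs = [-x + j * math.log(x) - math.lgamma(j + 1) for j in range(n)]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))

def log_markov_bound(lam, L, theta):
    """log of exp{-lam * [theta * L - psi(theta)]} with psi(theta) = -log(1 - theta)."""
    psi = -math.log(1.0 - theta)
    return -lam * (theta * L - psi)

if __name__ == "__main__":
    lam, L = 50, 3.0
    exact = log_tail_gamma(lam, lam * L)
    for theta in (0.2, 0.5, 1.0 - 1.0 / L, 0.9):
        print(theta, exact, log_markov_bound(lam, L, theta))
```

The choice $\theta = 1 - 1/L$ maximises $\theta L - \psi(\theta)$ and gives the exponent $-\lambda J(L)$ with $J(L) = L - 1 - \log L$, which grows without bound as $L$ increases.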


5. LD Rates for Stochastic Processes with Stationary Independent
Increments with no Gaussian Component.


Let $\{X(t), 0 \le t \le 1\}$ be a stochastic process with stationary independent increments and measurable sample paths with $X(0) = 0$. Let the infinitely divisible random variable $X(1)$ have a moment generating function $\phi(\theta)$ which is finite for $|\theta| < \eta$ for some $\eta > 0$. Assume that $X(1)$ possesses no Gaussian component.

From standard results on infinitely divisible distributions (e.g. Breiman, 1968, Chapter 14) it follows that

$\psi(\theta) = \log \phi(\theta) = \int (e^{\theta x} - 1)\, d\nu(x)$,

where the Lévy measure $\nu$ (possibly unbounded) satisfies $\int |x|\, d\nu(x) < \infty$, and that the sample paths of $\{X(t), 0 \le t \le 1\}$ lie in $BV[0,1]$, the space of functions of bounded variation on $[0,1]$. Thus, we can write

$X(t) = X^{(1)}(t) - X^{(2)}(t)$,

where $X^{(1)}(t)$ and $X^{(2)}(t)$ are two independent stochastic processes with stationary and non-negative independent increments, and the Lévy measures for $X^{(1)}(1)$ and $X^{(2)}(1)$ are given by $\nu^{(1)}(A) = \nu(A \cap [0,\infty))$ and $\nu^{(2)}(A) = \nu(-A \cap (-\infty,0))$, respectively.


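Let $J$, $J^{(1)}$ and $J^{(2)}$ denote the rate functions associated with $X(1)$, $X^{(1)}(1)$ and $X^{(2)}(1)$. That is, $J(a) = \sup_\theta \{\theta a - \psi(\theta)\}$ and $J^{(i)}(a) = \sup_\theta \{\theta a - \psi^{(i)}(\theta)\}$ where $\psi^{(i)}(\theta) = \int (e^{\theta x} - 1)\, d\nu^{(i)}(x)$ is the cumulant generating function of $X^{(i)}(1)$, $i = 1,2$.

Let $\alpha$ be a probability measure on $[0,1]$. Define

(5.1) $Z_\lambda(t) = \lambda^{-1} X(\lambda \alpha([0,t]))$ for $0 \le t \le 1$.

Let $Z_\lambda^{(1)}(\cdot)$ and $Z_\lambda^{(2)}(\cdot)$ be defined in terms of $X^{(1)}(\cdot)$ and $X^{(2)}(\cdot)$ in a fashion similar to (5.1). Then,

(5.2) $Z_\lambda(t) = Z_\lambda^{(1)}(t) - Z_\lambda^{(2)}(t)$.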

Note that $\{Z_\lambda(t) : 0 \le t \le 1\}$ takes values in $BV[0,1]$, the space of functions of bounded variation, or equivalently, signed measures on $[0,1]$. Let $f \in BV[0,1]$. Let its Hahn-Jordan decomposition be given by

$f = h^{(1)} - h^{(2)}$,

where $h^{(1)}, h^{(2)} \in M[0,1]$. Also suppose that

$f = f^{(1)} - f^{(2)}$,

where $f^{(1)}, f^{(2)} \in M[0,1]$, and for any function $p$ in $BV[0,1]$ let $p = p_1 + p_2$ where $p_1 \ll \alpha$ and $p_2 \perp \alpha$ and let $p_1' = dp_1/d\alpha$. It is clear that

(5.3) $f_1 = h_1^{(1)} - h_1^{(2)} = f_1^{(1)} - f_1^{(2)}$, $f_1' = f_1^{(1)\prime} - f_1^{(2)\prime}$,

and

(5.4) $\inf\{f_2^{(i)}([0,1]) : f = f^{(1)} - f^{(2)},\ f^{(1)}, f^{(2)} \in M[0,1]\} = h_2^{(i)}([0,1])$, $i = 1,2$.

The definitions of $f_1'$, $h_2^{(1)}$, $h_2^{(2)}$ above will be used in the statement of the theorem below, which contains the main LD result of this paper.

Theorem 5.1. Let $P_\lambda$ be the probability distribution of $\{Z_\lambda(t), 0 \le t \le 1\}$. Then $P_\lambda$ satisfies the LD principle with the rate function

(5.5) $I(f) = \int J(f_1')\, d\alpha + C_1 h_2^{(1)}([0,1]) + C_2 h_2^{(2)}([0,1])$,

where $f_1'$, $h_2^{(1)}$, $h_2^{(2)}$ are as defined before and where $C_1$ and $C_2$ are given by (3.6).

Proof. Let $P_\lambda^{(i)}$ be the distribution of $Z_\lambda^{(i)}(\cdot)$ in $M[0,1]$, $i = 1,2$. Let $g$ be a function from $M[0,1] \times M[0,1]$ into $BV[0,1]$ defined by $g(f^{(1)}, f^{(2)}) = f^{(1)} - f^{(2)}$. Then $g$ is a continuous function and $P_\lambda = (P_\lambda^{(1)} \times P_\lambda^{(2)}) g^{-1}$. From Theorems 4.1, 4.2 and Lemma 4.3, $P_\lambda^{(i)}$ is LD tight and satisfies the LDP with rate function

$I^{(i)}(f) = \int J^{(i)}(f_1')\, d\alpha + C^{(i)} f_2([0,1])$,

where $f = f_1 + f_2$ with $f_1 \ll \alpha$ and $f_2 \perp \alpha$ and $f_1' = df_1/d\alpha$, and where $C^{(i)}$ is given by (3.10). From Corollary 2.9, $P_\lambda^{(1)} \times P_\lambda^{(2)}$ satisfies the LDP with rate function $I^{(1)}(f^{(1)}) + I^{(2)}(f^{(2)})$ for $f^{(1)}, f^{(2)} \in M[0,1]$. From the contraction principle, $P_\lambda$ satisfies the LDP with rate function

$\inf_{f^{(1)}, f^{(2)} :\, f = f^{(1)} - f^{(2)}} \left\{ \int J^{(1)}(f_1^{(1)\prime})\, d\alpha + \int J^{(2)}(f_1^{(2)\prime})\, d\alpha + C^{(1)} f_2^{(1)}([0,1]) + C^{(2)} f_2^{(2)}([0,1]) \right\}$

$= \int J(f_1')\, d\alpha + C_1 h_2^{(1)}([0,1]) + C_2 h_2^{(2)}([0,1])$

in view of (5.3), (3.11) and (5.4). □


In  this  section  we  evaluate  the  rate  functions  for  three  processes. 


Example 1 - Poisson Processes.

Let $\{X(t), 0 \le t \le 1\}$ be a Poisson process with constant intensity $\mu$. Define the process $\{Z_\lambda(t), 0 \le t \le 1\}$ as in (4.2). Then $\{\lambda Z_\lambda(t), 0 \le t \le 1\}$ is a Poisson process with intensity function $\lambda \mu \alpha([0,t])$. The distribution of $X(1)$ is Poisson with parameter $\mu$ and thus

$J(a) = a \log \frac{a}{\mu} - a + \mu$ and $C_1 = \infty$,

where $J(a)$ and $C_1$ are as defined in (3.2) and (3.6). Thus, as an application of Theorems 4.1 and 4.2, $\{Z_\lambda(t), 0 \le t \le 1\}$ satisfies the LDP with rate function

(6.1) $I(f) = \int f_1' \log\left( \frac{f_1'}{\mu} \right) d\alpha + \mu - f([0,1])$ if $f \ll \alpha$, and $I(f) = \infty$ otherwise.

This result can also be derived from Varadhan (1966) since $C_1 = \infty$.
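A simple consequence of this example can be checked numerically. For $a \ge \mu$, Jensen's inequality suggests that the infimum of the rate function over measures of total mass at least $a$ equals $J(a)$, so one would expect $\frac{1}{\lambda} \log P(Z_\lambda(1) \ge a)$ to approach $-J(a)$. The sketch below is an illustration under the assumption that $\lambda Z_\lambda(1)$ is Poisson with mean $\lambda\mu$; the function names, the truncation length `extra`, and the parameter values are choices made here. It computes the exact Poisson tail in log space.

```python
import math

def J_poisson(a, mu):
    """Poisson rate function a*log(a/mu) - a + mu, with J(0) = mu."""
    if a == 0.0:
        return mu
    return a * math.log(a / mu) - a + mu

def log_tail_poisson(mean, k0, extra=4000):
    """log P(N >= k0) for N ~ Poisson(mean), summing the terms k0 .. k0+extra-1
    in log space; the neglected terms are negligible when k0 is well above mean."""
    logs = [-mean + k * math.log(mean) - math.lgamma(k + 1)
            for k in range(k0, k0 + extra)]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))

if __name__ == "__main__":
    mu, a = 1.0, 2.0
    for lam in (100, 500, 2000):
        value = log_tail_poisson(lam * mu, int(lam * a)) / lam
        print(lam, value, -J_poisson(a, mu))
```

The normalised log-probabilities stay below $-J(a)$, as the Chernoff bound requires, and the gap shrinks roughly like $\log(\lambda)/\lambda$.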


Example 2 - Gamma Processes. Let $\{X(t), 0 \le t \le 1\}$ be a Gamma process, that is, a stochastic process with stationary independent increments and measurable paths with $X(0) = 0$ and such that $X(1)$ has a Gamma distribution with shape parameter 1. Then

$J(a) = a - 1 - \log a$, $J(0) = \infty$ and $C_1 = 1$,

where $J(a)$ and $C_1$ are as defined in (3.2) and (3.6). Then the process $\{Z_\lambda(t), 0 \le t \le 1\}$ as defined in (4.2) satisfies the LDP with rate function

$I(f) = f([0,1]) - 1 - \int \log f_1'\, d\alpha$ if $\alpha \ll f_1$, and $I(f) = \infty$ otherwise.


Example 3 - Dirichlet Processes. Consider the process $\{W_\lambda(t), 0 \le t \le 1\}$ where $W_\lambda(t) = Z_\lambda(t)/Z_\lambda(1)$ and where $Z_\lambda$ is as defined in Example 2. Then $\{W_\lambda(t), 0 \le t \le 1\}$ is the Dirichlet process with parameter $\lambda \alpha(\cdot)$ as defined in Ferguson (1973). Sethuraman and Tiwari (1982) have shown that as $\lambda \to 0$, $W_\lambda$ converges in distribution to $W_0$ where $W_0$ is the random probability measure $\delta_Y(\cdot)$, where $\delta_a(\cdot)$ stands for the degenerate measure at $a$ and $Y$ is a random variable with distribution $\alpha$. However, if we let $\lambda \to \infty$, then $W_\lambda$ converges to the constant $\alpha$ in $M[0,1]$. The contraction principle and the LDP for the Gamma process show that the Dirichlet process with parameter $\lambda \alpha$ satisfies the LDP, as $\lambda \to \infty$, with the rate function

$I(f) = K(\alpha,f)$ if $f(1) = 1$ and $\alpha \ll f$, and $I(f) = \infty$ otherwise,

where $K(\alpha,f)$ is the Kullback-Leibler information number between two probability measures $\alpha$ and $f$ defined by

$K(\alpha,f) = \int \log \frac{d\alpha}{df}\, d\alpha$.
